Communication-Efficient Parallel Dense LU Using a 3-Dimensional Approach
Authors
Abstract
We present new communication-efficient parallel dense linear solvers: an LU factorization algorithm and a triangular linear solver. The new algorithms perform asymptotically a factor of P^{1/6} less communication than existing algorithms, where P is the number of processors. The new algorithms employ a 3-dimensional (3D) approach, which has previously been applied only to matrix multiplication. We have implemented and tested the new algorithms. Our LU factorization algorithm is competitive with ScaLAPACK and scales better with the number of processors. The 3D approach reduces communication by replicating data: the algorithms perform less communication but use more temporary storage than existing algorithms, all of which use a 2-dimensional (2D) approach. Until now, the 3D approach has been used only for parallel matrix multiplication, in algorithms proposed by Berntsen [3], by Aggarwal, Chandra, and Snir [2], by Gupta and Kumar [5], by Johnsson [7], and by Agarwal, Balle, Gustavson, Joshi, and Palkar [1]. 3D algorithms work by distributing the 3D iteration space of the computation among processors. Matrix-matrix computations that can be implemented using three nested loops have a natural representation on a 3D grid in which every grid
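The idea of distributing a 3D iteration space can be illustrated with a small serial simulation. The sketch below is not the authors' LU algorithm; it uses plain matrix multiplication, the computation to which the 3D approach was applied first. It assumes a hypothetical q x q x q processor grid (so P = q^3), and the names `block_ranges`, `matmul_3d`, and `words_per_proc` are invented for this example. Each simulated processor (a, b, c) handles the iterations (i, j, k) whose indices fall in blocks a, b, and c, and the partial products are then summed along the third grid dimension.

```python
from itertools import product


def block_ranges(n, q):
    """Split range(n) into q contiguous blocks of near-equal size."""
    bounds = [(n * t) // q for t in range(q + 1)]
    return [range(bounds[t], bounds[t + 1]) for t in range(q)]


def matmul_3d(A, B, q):
    """Simulate C = A * B on a hypothetical q x q x q processor grid.

    Processor (a, b, c) owns the iterations (i, j, k) with i in block a,
    j in block b, and k in block c. It needs only block (a, c) of A and
    block (c, b) of B. The loop below visits the processors one by one;
    a real implementation would run them in parallel.
    """
    n = len(A)
    blocks = block_ranges(n, q)
    C = [[0] * n for _ in range(n)]
    words_per_proc = {}
    for a, b, c in product(range(q), repeat=3):
        I, J, K = blocks[a], blocks[b], blocks[c]
        # Local work: partial products over this processor's k block.
        partial = {
            (i, j): sum(A[i][k] * B[k][j] for k in K) for i in I for j in J
        }
        # Input words held by this processor: its A block plus its B block.
        words_per_proc[(a, b, c)] = len(I) * len(K) + len(K) * len(J)
        # Reduction along the third grid dimension (done serially here).
        for (i, j), value in partial.items():
            C[i][j] += value
    return C, words_per_proc


# Usage: a 6 x 6 product on a simulated 2 x 2 x 2 grid (P = 8).
n, q = 6, 2
A = [[i + 2 * j for j in range(n)] for i in range(n)]
B = [[i * j + 1 for j in range(n)] for i in range(n)]
C, words = matmul_3d(A, B, q)
```

In this simulation each processor holds about 2n^2/q^2 input words. Because the q^3 processors together hold 2qn^2 words, every block of A and B is replicated q times, which is the extra temporary storage that the abstract says is traded for reduced communication.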
Similar resources
Trading Replication for Communication in Parallel Distributed-Memory Dense Solvers
We present new communication-efficient parallel dense linear solvers: a solver for triangular linear systems with multiple right-hand sides and an LU factorization algorithm. These solvers are highly parallel and they perform a factor of 0.4P^{1/6} less communication than existing algorithms, where P is the number of processors. The new solvers reduce communication at the expense of using more tempora...
Green and efficient synthesis of propargylamines via A3 coupling reaction using a copper (II)–thioamide combination
A one-pot green three-component coupling reaction of aldehyde, phenylacetylene, and amine derivatives in the presence of a copper (II)–thioamide combination as a novel and efficient heterogeneous catalyst under solvent-free conditions is reported. The catalyst displayed high activity and afforded the corresponding propargylamines in good to high yields. The key to this procedure was the generatio...
A 3D Parallel Communication-Efficient Dense Linear Solver (Tel-Aviv University, Raymond and Beverly Sackler Faculty of Exact Sciences, School of Mathematical Sciences)
We present new communication-efficient parallel dense linear solvers: a solver for triangular linear systems with multiple right-hand sides and an LU factorization algorithm. These solvers are asymptotically work efficient and they perform a factor of P^{1/6} less communication than any existing algorithm, where P is the number of processors. In other words, these solvers are likely to run faster than...
Towards dense linear algebra for hybrid GPU accelerated manycore systems
We highlight the trends leading to the increased appeal of using hybrid multicore + GPU systems for high performance computing. We present a set of techniques that can be used to develop efficient dense linear algebra alg... (doi:10.1016/j.parco.2009.12.005)
Communication-optimal Parallel and Sequential QR and LU Factorizations
We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform and just as stable as Householder QR. We prove optimality by deriving new lower bounds for the number of multiplications done by “non-Strassen-like” QR, and using these in known communication lower bounds that are proportional to ...
Publication date: 2001